Europäisches Patentamt
European Patent Office
Office européen des brevets



Publication number: 0 358 628 A2






EUROPEAN PATENT APPLICATION 



Application number: 89850293.5
Date of filing: 06.09.89



Int. Cl.5: G 05 D 1/02

Priority: 06.09.88 US 241059

Date of publication of application:
14.03.90 Bulletin 90/11 

Designated Contracting States:
AT BE CH DE ES FR GB GR IT LI LU NL SE



Applicant: TRANSITIONS RESEARCH CORPORATION
15 Great Pasture Road
Danbury, CT 06810 (US)

Inventor: Evans, John Martin, Jr.
15 Maple Lane
Brookfield, CT 06804 (US)

Weiman, Carl Frederick Reinhold
26 High Point Road 
Westport, CT 06880 (US) 

King, Steven Joseph 
34 Pilgrim Trail 
Woodbury, CT 06798 (US) 

Representative: Hagelback, Evert Isidor et al
c/o AB Electrolux Corporate Patents & Trademarks
S-105 45 Stockholm (SE)



Visual navigation and obstacle avoidance structured light system.

A vision system for a vehicle, such as a mobile robot (10)
includes at least one radiation projector (14,16) which projects
a structured beam of radiation into the robot's environment. The
structured beam of radiation (14a, 16a) preferably has a 
substantially planar pattern of sufficient width to encompass the 
immediate forward path of the robot and also to encompass 
laterally disposed areas in order to permit turning adjustments. 
The vision system further includes an imaging sensor (12) such
as a CCD imaging device having a two-dimensional field of view 
which encompasses the immediate forward path of the robot. 
An image sensor processor (18) includes an image memory 
(18A) coupled to a device (18D) which is operable for accessing 
the image memory. Image processing is accomplished in part 
by triangulating the stored image of the structured beam 
pattern to derive range and bearing, relative to the robot, of an 
object being illuminated. A navigation control system (20) of the 
robot inputs data from at least the vision system and infers
therefrom data relating to the configuration of the environment 
which lies in front of the robot. The navigation control system 
generates control signals which drive propulsion and steering 
motors in order to navigate the robot through the perceived 
environment.

Description

Visual navigation and obstacle avoidance structured light system 



This invention relates generally to a navigation and 
obstacle avoidance vision system for a moving 
vehicle, such as a mobile robot, and, in particular, to a
vision system which includes at least one structured, 
substantially planar radiation pattern which is pro- 
jected along a path of the vehicle and which further 
includes an image sensor for sensing reflected 
radiation.

An autonomous vehicle, such as a mobile robot, 
typically comprises some type of sensor system for 
sensing an environment through which the vehicle 
navigates. Preferably, the sensor system has the 
capability to detect obstacles within the path of the
robot so that appropriate action may be taken. This 
action may include altering the path of the robot in 
order to steer around the obstacle. Alternatively, a 
sensed object may represent a navigation landmark, 
such as a support post, door frame, or wall, which
the robot uses as a registration reference in 
following a preprogrammed trajectory. Systems 
employing ultrasonic detectors, mechanical contact 
devices and laser ranging apparatus are known in 
the art. Other systems which include a camera to
observe the environment and a passive image 
processing system are also known. 

A problem associated with ultrasonic detectors 
relates to the difficulty in obtaining reliable and 
consistent range signals in an environment which
normally includes a number of objects having
differing specular reflection characteristics. The
objects also typically differ in size, surface characteristics
and orientation relative to the ultrasound
transmitter. A problem associated with mechanical
contact devices relates at least to a lack of 
resolution and to a requirement that the obstacle 
actually be contacted in order to generate a signal. 
For some applications, such as navigation through a 
workplace or a hospital, the obstacle may be a
human being. As can be appreciated, for these 
applications physical contact with the obstacle may 
be undesirable. Laser ranging systems are expensive,
bulky, and consume substantial power. Traditional
passive scene analysis vision systems require
large amounts of computing power, are relatively 
slow and often yield erroneous results. Typically the 
interpretation of data is too slow to be useful for real 
time navigation, and may prove erroneous, such as 
interpreting a shadow as an object, which results in
navigation errors. 

It has also been known to provide visual markers 
or "beacons" within the robot's environment. Such 
beacons are undesirable in that they introduce 
additional cost and complexity to the system and
constrain the motion of the robot to a region wherein 
the beacons are visible. 

Commercial applications of mobile robots in the 
service sector include floor cleaning, aids to the 
handicapped, hospital delivery systems, mail carts,
and security. These applications require robust,
reliable navigation using sensors which are low in
cost and power consumption while providing real-time
maneuvering data.

It is therefore one object of the invention to 
provide a simplification of vision and vision process- 
ing for a mobile robot. 

It is another object of the invention to provide a 
vision system for a mobile robot, the system
requiring a minimum of image processing complexity
while yet having an image resolution which is
sufficient for guiding the robot through an environ- 
ment. 

It is a further object of the invention to provide a
vision system for a mobile robot which does not 
require beacons or other environmental modification 
means to be disposed within the robot's environ- 
ment. 

It is another object of the invention to provide a 
vision system for a mobile robot which provides a 
complete and unambiguous interpretation of ob- 
stacles and landmarks relevant to navigation which 
lie in the path of the robot while having a minimal 
complexity, cost and power consumption as com- 
pared to conventional passive image analysis sys- 
tems. 

It is a still further object of the invention to
provide a vision system for a mobile robot which 
operates in a high speed manner and which permits 
the continuous, adaptive motion of the robot 
through the robot's environment. 



SUMMARY OF THE INVENTION 

The aforedescribed problems are overcome and 
the objects of the invention are realized by an object 
detection or vision system for a vehicle, such as a 
mobile robot which, in accordance with methods 
and apparatus of the invention, includes at least one 
radiation projector which projects a structured beam 
of radiation into the robot's environment. The 
structured beam of radiation preferably has a 
substantially planar pattern of sufficient width to 
encompass the immediate forward path of the robot 
and also to encompass laterally disposed areas in 
order to permit turning adjustments. The brightness, 
spectral characteristics and pulse repetition rate of 
the structured beam are predetermined to maximize 
signal to noise ratio in an imaging sensor over a 
variety of ambient lighting conditions, while consum- 
ing minimal energy. 

The object detection system of the invention 
further includes an imaging sensor which includes 
an electronic camera having a two-dimensional field 
of view which encompasses the immediate forward 
path of the robot. An image sensor processor may 
include a frame grabber, or image memory, coupled 
to a data processing device which is operable for 
accessing the image memory wherein the field of 
view of the camera is represented as binary data. 
Image processing is accomplished in part by 
triangulating the stored image of the structured 
beam pattern to derive at least range and bearing 
information, relative to the robot, of an object
reflecting the substantially planar structured beam of 
radiation. 

A motion control system of the robot inputs data 
from at least the vision system and infers therefrom 
data relating to the configuration of the environment 
which lies in front of the robot. The motion control 
system generates control signals which drive pro- 
pulsion and steering motors in order to navigate the 
robot through the perceived environment. 



BRIEF DESCRIPTION OF THE DRAWINGS 

The foregoing aspects of the invention will be 
made more apparent in the ensuing detailed de- 
scription of the invention read in conjunction with 
the accompanying drawings wherein: 

Fig. 1a is an illustrative block diagram 
showing a mobile robot, constructed and 
operated in accordance with one embodiment 
of the invention, which includes a camera 
having a downward pointing field of view and 
being disposed above two forwardly projecting 
structured beams of radiation; 

Fig. 1b is a block diagram of the image 
processor 18 of Fig. 1a; 

Figs. 2a and 2b show a side view and a top 
view, respectively, of one embodiment of a 
structured beam projector, the projector com- 
prising a flash tube, a cylindrical mirror and a 
plurality of cylindrical lens elements; 

Fig. 2c shows a side view of another
embodiment of a structured beam projector,
the projector comprising a flashtube, a cylindrical
mirror and a plurality of apertures;

Figs. 3a and 3b are a lateral view and a top view,
respectively, of structured beam patterns projected
by the robot of Fig. 1;

Fig. 4 is an illustrative side view of a mobile 
robot constructed in accordance with another 
embodiment of the invention, the robot having 
an upper, downward pointing structured beam 
projector disposed above a camera, the robot 
further comprising a pair of beam projectors for 
projecting planar beams which are orientated 
substantially orthogonally with respect to a 
lower, horizontal beam projector; 

Fig. 5 is a frontal view of the robot of Fig. 4; 

Fig. 6 is a diagram which illustrates a 
processed field of view of the robot of Figs. 4 
and 5;

Fig. 7 is an illustrative view of the successive
reflections of vertically orientated structured 
beam projectors from successively more dis- 
tant vertical objects; and 

Fig. 8 is an illustrative view of the reflections 
from objects within a robot's environment, the 
reflections being due to an obliquely projecting 
structured beam projector. 



DETAILED DESCRIPTION OF THE INVENTION

Referring now to Fig. 1a there is shown a side view
of one embodiment of a mobile robot 10 comprising
an electronic imaging device, such as a camera 12,
and a plurality of structured beam projectors, namely
an upper projector 14 and a lower projector 16. In
accordance with the invention this optical configuration
both detects and measures the position of
objects lying within or closely adjacent to the forward
path of the robot 10. These objects might be
obstacles such as furniture or pedestrians. The
objects may also be reference surfaces, such as walls
and door frames.

The camera 12 preferably includes a CCD imaging
device having a square or rectangular field of view
(FOV) which is directed obliquely downward such
that it encompasses the forward path of the robot 10
in the immediate maneuvering vicinity. The camera
12 generates a plurality of pixels, individual ones of
which have a value indicative of an intensity of
radiation incident upon a corresponding surface
area of the camera radiation sensing device. The
structured beams 14a and 16a which are projected
by projectors 14 and 16, respectively, have the
general form of a plane or slit of radiation disposed
to intersect the field of view in a region most likely to
be occupied by furniture, walls, pedestrians, or other
obstacles.

Robot 10 further comprises an image processor
18 which is coupled to the output of camera 12.
Image processor 18, as shown in greater detail in
Fig. 1b, comprises a video memory 18A which stores
a representation of one video frame output of
camera 12. An input to video memory 18A may be
provided by an analog to digital (A/D) converter 18B
which digitizes the analog output of camera 12. The
digital output of A/D 18B may form an address input
to a lookup table (LUT) 18C wherein pixel brightness
values may be reassigned. The LUT 18C may also be
employed for image thresholding and/or histogram
correction. Image processor 18 further comprises
an image processing device, such as a microcomputer
18D, which is coupled to the video memory
18A and which is operable for reading the stored
video frame data therefrom. Image processor 18
further comprises memory 18E which includes
memory for storing program data. This program data
is operable for performing at least triangulation
calculations upon the stored image frame data, this
triangulation computation being described in detail
hereinafter. Image processor 18 may further comprise
memories 18F and 18G each of which stores a
data structure, such as a lookup table, associated
with a particular projector 14 or 16. Individual entries
in each table correspond at least to range and
bearing information associated with individual pixels
of an image frame. This aspect of the invention will
also be described in detail below. Image processor
18 may have a plurality of outputs coupled to
projectors 14 and 16 for energizing the projectors
for a predetermined period of time. As will be
described, the operation of the projectors 14 and 16
is synchronized to the operation, or frame rate, of
the camera 12 while being desynchronized to each
other. An output of image processor 18 which is
expressive of position information relating to objects
within the FOV of camera 12 may be supplied, via an
RS-232 or parallel data link, to a navigation control
processor 20 which derives navigation data based 
upon the perceived image of the environment. Such 
data may be employed to steer the robot down a 
straight path or may be employed to alter the path of 
the robot in order to avoid an obstacle within the 
path of the robot. An output of navigation control 
processor 20 is supplied to a drive and steering 
control 22 which has outputs coupled to drive and 
steering wheels 24. The wheels 24 are in contact with 
a supporting surface 26 which is typically a floor. 
Navigation control processor 20 may receive an 
output from the drive and steering control 22, the 
output being expressive of odometer readings which 
relate to the distance traveled by the robot 10. 
Navigation control processor 20 typically comprises 
a data processing device having associated memory 
and support circuitry. An enclosure is provided to 
contain the aforementioned apparatus and to provide
protection therefor.

The camera 12 may be a model TM440 CCD 
camera manufactured by Pulnix. The camera 12 may 
have a relatively short focal length of, for example,
6.5 mm in order to maximize the field of view. 
Microcomputer 18D may be an 80286 microproces- 
sor device manufactured by Intel. LUT 18C and video 
memory 18A may be contained within a frame grabber
pc-board such as a type manufactured by Coreco or 
Imaging Technologies. In general, image processor 
18 may conform to a standard computer architecture
having printed circuit boards coupled to a common 
backplane and communicating over a bus. It should 
be realized that the invention may be practiced by a 
number of different means and should not be 
construed to be limited to only that disclosed herein. 

Although the projectors 14 and 16 may be
operable for projecting planar beams having any 
desired spectral characteristics a preferred embodi- 
ment of the invention employs a broad, near infrared 
(IR) light source having wavelengths within the 
range of approximately 700 to approximately 1000 
nanometers (nm). Near-IR radiation is preferable for 
a number of reasons. Near-IR radiation is unobtrusive
to humans who may be sharing the environment
with the robot 10. CCD imaging sensors, which
are preferred because of low cost and power
consumption, are sensitive to near-infrared radiation.
In addition, and relating to projectors 14 and 16,
infrared light emitting diodes (LEDs) are energy 
efficient and available at low cost. In this regard it 
has been found that laser diode devices consume 
more energy per emitted power and typically provide 
a relatively narrow spectrum which may not optimally 
match the sensitivity of the camera 12. However, it 
should be realized that the invention may be 
practiced with any source, such as an incandescent
lamp, laser, flash lamp or light emitting diode, having 
wavelengths which are efficiently detected by a 
radiation sensor. Furthermore it should be realized 
that the planar radiation pattern may be formed by 
any of a number of suitable techniques including, but 
not limited to, providing a knife-edged aperture, 
focusing and/or collimating the beam with a lens, or
mechanically scanning either the source of radiation 
or a reflecting element. 



The energy of the output radiation beams 14a and
16a is preferably of sufficient magnitude to be
distinguishable from ambient lighting, while consuming
minimal power. In indoor environments interference
from fluorescent lighting, which peaks in the
visible spectrum, may be minimized by employing an
infrared pass filter 12a at the input to the camera 12,
thereby improving the system signal to noise ratio. A
low duty cycle of the projected planar beam further
improves efficiency. That is, the light source
may be energized for a few milliseconds, corresponding
to the interval of image exposure, after
which the light source may be de-energized to
conserve energy. For example, if the vehicle is
travelling at one meter per second, a relatively rapid
rate for a mobile robot sharing space with humans,
one flash per 100 milliseconds results in an image
being obtained once every 10 centimeters of floor
travel. Many normal sized obstacles, such as
furniture, are larger than this increment of travel.
Thus, this rate of image exposure is sufficient for
avoiding most normal sized obstacles.
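
The travel-per-image figure quoted above follows from simple arithmetic, which may be sketched as below. The function name and its parameters are illustrative assumptions and do not appear in the specification.

```python
def travel_per_image_cm(speed_m_per_s: float, flash_period_ms: float) -> float:
    """Floor distance, in centimeters, covered between two consecutive flashes."""
    # meters/second * 100 cm/m * milliseconds / 1000 ms/s
    return speed_m_per_s * 100.0 * flash_period_ms / 1000.0


# One meter per second with one flash every 100 milliseconds, as in the text.
spacing_cm = travel_per_image_cm(1.0, 100.0)
print(spacing_cm)
```

Halving the flash period or the vehicle speed halves the spacing between images, which is the trade-off between energy consumption and obstacle coverage discussed in the paragraph.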

Another technique to improve signal to noise ratio
while conserving energy is to acquire two images in
quick succession, one flashed and one non-flashed,
and then to subtract on a pixel-by-pixel basis the
brightness values of the non-flashed image from
those of the flashed image. This technique is known
in the art as image subtraction and results in the
reflected pattern due to the structured radiation
projector being emphasized.
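
A minimal sketch of the image subtraction described above is given here, with images held as nested lists of brightness values. Clamping negative differences to zero is an assumption made for illustration, and the sample pixel values are invented.

```python
def subtract_images(flashed, unflashed):
    """Pixel-by-pixel difference of two equally sized grey scale images.

    Negative differences are clamped to zero (an assumption; the
    specification does not state how they are handled).
    """
    return [
        [max(f - u, 0) for f, u in zip(flashed_row, unflashed_row)]
        for flashed_row, unflashed_row in zip(flashed, unflashed)
    ]


# Invented 3 x 3 example: ambient light is similar in both frames,
# while the middle pixel is lit only by the projected stripe.
flashed_frame = [[40, 42, 41], [90, 200, 95], [38, 40, 39]]
unflashed_frame = [[40, 41, 41], [88, 60, 93], [38, 41, 39]]
difference = subtract_images(flashed_frame, unflashed_frame)
print(difference)
```

Pixels lit mainly by ambient light nearly cancel, whereas the pixel lit by the stripe keeps a large value and is easy to threshold.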

A strobe light source having an output planar
beam forming means, such as a knife-edge aperture,
may be employed as a structured beam projector.
The short duration of a typical strobe flash implies
a low duty cycle and hence an increased energy
efficiency. If a xenon strobe source is employed it is
desirable to include an infrared pass filter at the
strobe output to reduce annoyance to humans
sharing the maneuvering space with the robot.

In accordance with one embodiment of the
invention there is illustrated in Figs. 2a and 2b a
beam projector, such as the beam projector 14 of
Fig. 1a, which comprises an elongated, substantially
cylindrical xenon flash tube 28 which is interposed
between a circular cylindrical reflector 28a and an
aspheric cylindrical lens 28b. Lens 28b may have a
focal length of approximately 0.5 inches and operates
to focus both the direct and reflected light toward an
aspheric cylindrical lens 28c. The flashtube 28 preferably is
positioned at the focal point of cylindrical reflector
28a so that direct and reflected light rays are
co-aligned on entering lens 28b. The mirror reflector
28a thus increases the energy efficiency of the
structured light system by gathering light emitted
from the back of the flash tube and sending it back in
the same direction as light emitted directly from the
front of the tube. Lenses 28b and 28c may be
Fresnel lenses; such lenses are preferred to
solid glass or plastic in that they are lighter, thinner,
and can accommodate shorter focal lengths without
spherical aberration. Shorter focal lengths are
preferred because they collect light from a wider
angle, so less radiant energy is lost. Cylindrical lens
28c may also have a focal length of approximately 0.5
inches and operates to collimate the radiation and to
provide a planar radiation beam output. As was
previously stated, a pass band filter 28d may be
provided for filtering out substantially all wavelengths
except those in a desired range, such as a
range of 700 to 1000 nm.

As shown in Fig. 2c lenses 28b and 28c may be 
replaced by slit apertures 28e and 28f which 
collimate emitted light from flash tube 28. This 
arrangement is more wasteful of energy, but is 
simpler in design and less costly than the provision 
of lenses to collimate the radiation. 

In general, it has been determined that the width
of the projected planar beam, or radiation stripe
pattern, is preferably broad enough to span the path
in front of the robot, but simple enough to afford
unambiguous interpretation. Thus, a single radiation 
stripe is preferred for a single image capture, 
although several stripes may be flashed in succes- 
sion. For example, two horizontal radiation stripes 
projected alternately and viewed in consecutive 
images, which project at approximately ankle level 
and chair seat level, have been found to be useful for 
indoor navigation to detect low and medium height 
obstacles within the environment. If there are no 
obstacles at these levels to reflect the radiation 
stripes the image viewed by the camera 12 is 
substantially blank. Thus a very simple "no image" 
condition can be readily detected without significant 
signal processing, allowing the robot to proceed at 
top speed. 

In presently preferred embodiments of the inven- 
tion the structured beam projectors 14 and 16 and 
the camera 12 are mounted rigidly on the body of the 
robot 10 such that triangulation geometry process- 
ing which relates pixel position to an environmental 
position remains fixed in time. However, it is also 
possible to employ a movable camera and/or 
movable beam projectors whose relative positions 
and orientations may be varied. In this case, more 
complex imaging processing is required to account 
for the changes in position. It is also within the scope 
of the invention to provide for only one beam 
projector. 

In accordance with one aspect of the invention 
relatively nearby objects within a range of 2-10 feet 
are illuminated with a structured radiation pattern, 
preferably a stripe of radiation. The image of the 
structured radiation reflecting to an image sensor,
such as the CCD camera 12, is analyzed to
determine the range, bearing and elevation geometry
of objects relative to the robot 10 and the
plane of the floor 26. The structure and pattern of 
light preferably provides azimuth coverage of ap- 
proximately 90 degrees, leaving no gaps. With the 
span of the structured pattern being about 90 
degrees the peripheral illuminance is preferably at 
least 50% of central illuminance. Illuminance fluctuations
along the pattern boundary are generally
tolerable to magnitudes of 25%, insofar as they may
be compensated for by an intensity value lookup 
table. The cross section of the beam profile is 
preferably sharp enough such that there is a drop 
from substantially full illumination to substantially no 
illumination within a distance of approximately two
inches on a perpendicularly illuminated surface at a
distance of ten feet. This change in illumination
decreases proportionally for closer surfaces, to one
half inch at 2.5 feet. The thickness of the projected
radiation beam at ten feet is preferably approximately
four inches if perfectly collimated. If divergent,
the angle of divergence should be less than
approximately two degrees.

Inasmuch as the robot 10 typically operates in
public areas it is desirable to minimize the visibility of
the light to humans. Furthermore, since a silicon
diode CCD camera 12 is presently preferred another
consideration is the efficient use of the sensitivity of
the camera 12. A wavelength range of 700 to 1000
nanometers achieves both of these goals. A filter on
the source and a like filter on the camera maximize
signal to noise ratio over ambient light. If the beam
projectors 14 and 16 have a sufficiently narrow
output spectrum substantially within the range of
700-1000 nanometers, a filter is only required on the
camera 12, the filter being matched to the spectrum
of the source.

Preferably the measured brightness at the CCD
camera 12 of the illuminated region, at a range of
2-10 feet and through a filter, is two to five times
greater than bright ambient light (corresponding to a
brightly lit work area) from an incandescent light
source, such as a 100 watt light bulb positioned five
feet from a surface.

A maximum useful duration of a pulse of output
radiation is 33 milliseconds for a typical CCD camera 
image acquisition. Durations as short as one 
millisecond may be employed if the camera 12 
comprises an electronic shutter. 

A pulse repetition rate of the beam projectors is
preferably at least two flashes per second, and may 
be as high as 10 per second or more when gathering 
detailed information on nearby objects. At higher 
repetition rates, lower power flashes may be employed
because of the shorter range to the object.
Full power is generally required at repetition rates of 
four per second and slower. As was previously 
stated, control of the flash rate of the projectors 14 
and 16 is preferably accomplished by the image
processor 18.

In accordance with the invention image processing
performed by image processor 18 and navigation
control processor 20 generally involves the following
steps or operations:

(a) locating the image of light stripes rapidly
in the image;

(b) inferring the range and bearing of objects
from the located stripe images;

(c) storing a geometric map representation of
these object positions; and

(d) accessing and processing the map information
with navigation algorithms and generating
control signals which result in avoidance of
obstacles or navigation to reference landmarks.

The first step (a) includes an image processing
step of reducing the typically grey scale
camera image to a binary image. If the structured
beam is sufficiently bright to overpower
ambient illumination, image intensity may be
thresholded. Generally however, the structured
beam is not sufficiently bright to overcome all
ambient radiation in which case the aforementioned
image subtraction technique may be
employed. Reducing the grey scale image to a
binary image reduces the subsequent search for
the structured radiation stripe within the image
of the FOV to a less complex present/absent
detection technique.
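
The reduction to a binary image and the resulting present/absent test may be sketched as follows. The threshold value and the sample image are hypothetical; in practice the threshold would depend on the camera, filter and projector actually used.

```python
def to_binary(image, threshold):
    """Map each grey scale pixel to 1 if it exceeds the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def stripe_present(binary_image):
    """Present/absent test: True if any pixel survived thresholding."""
    return any(any(row) for row in binary_image)


# Hypothetical grey scale frame with a bright stripe on the middle row.
grey = [[12, 9, 14, 10], [11, 180, 175, 13], [8, 10, 12, 9]]
binary = to_binary(grey, threshold=128)
print(binary, stripe_present(binary))
```

An all-zero binary image corresponds to the case in which no obstacle reflects the stripe, so the later stages can be skipped for that frame.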

The first step (a) includes another image 
processing step which employs the use of a 
search algorithm which successively subdivides
the size of steps or increments taken of the 
image during the search. That is, it scans the 
image rapidly at a coarse resolution and then 
searches at a finer resolution when detection of 
a pixel above threshold is encountered. This is 
one form of a binary search. 
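
One way such a coarse-to-fine search might look is sketched below. This is an assumed illustration, not the algorithm of the specification: it samples widely spaced columns first, then repeatedly halves the column spacing, examining only columns next to ones in which lit pixels were already found.

```python
def lit_rows(binary_image, col):
    """Rows of the given column that contain a lit (non-zero) pixel."""
    return [r for r in range(len(binary_image)) if binary_image[r][col]]


def coarse_to_fine_scan(binary_image, coarse_step=8):
    """Return {column: lit rows}, refining the search only near earlier hits."""
    cols = len(binary_image[0])
    hits, examined = {}, set()
    step = coarse_step
    active = list(range(0, cols, step))  # first pass: widely spaced columns
    while active:
        for c in active:
            examined.add(c)
            rows = lit_rows(binary_image, c)
            if rows:
                hits[c] = rows
        step //= 2  # subdivide the increment
        if step == 0:
            break
        # Next pass: unexamined columns one (smaller) step away from a hit.
        active = sorted({c + d for c in hits for d in (-step, step)
                         if 0 <= c + d < cols and c + d not in examined})
    return hits


# Hypothetical 12 x 32 binary image with a stripe on row 5, columns 10..20.
image = [[0] * 32 for _ in range(12)]
for c in range(10, 21):
    image[5][c] = 1
hits = coarse_to_fine_scan(image, coarse_step=8)
print(sorted(hits))
```

In this example only about half of the columns are examined. A stripe segment narrower than the coarse step can be missed entirely, so the coarse step would have to be chosen with the smallest obstacle of interest in mind.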

Step (b) above, inferring position within the 
environment from image position, exploits the 
fixed mounting of the camera 12 and projectors 
14 and 16. Illumination of a particular pixel within 
the image for a particular projector output
implies a unique position within the environment
of an object reflecting the structured radiation.
Each of these unique pixel related positions
may be precalculated and stored in the lookup
tables 18F and 18G to enhance real-time 
operation or each position may be calculated as 
an illuminated pixel is detected. One preferred 
method of calculating range and bearing asso- 
ciated with each pixel will be described in detail 
below. 

Step (c) involves consolidating the individual 
determined range and bearing measurements 
into a geometric representation of the environ- 
ment, which includes motion of the robot 
relative to the environment. One technique 
which may be employed is to represent the floor 
26 as a two dimensional grid, to mark or 
designate grid cells which are occupied by 
detected objects, and to assign a degree of 
confidence in the visual measurement based on 
the persistence of detected objects at fixed 
position within a grid cell or cells. Figure 6 
illustrates such a map wherein the robot is 
moving in the direction of the v axis. The map 
which is stored in the navigation control 
processor 20 of the robot 10 is divided into cells 
which might typically be as small as one inch or 
as large as one foot on each side. When 
analysis of the image indicates the detection of 
an object at a particular (u,v) position, a 
confidence level C(u,v) is assigned to that 
position. This confidence level is increased as 
successive observations continue to detect a 
presence of the object at the same position. 
Confidence level ranges in value from 0.0 to 1.0. 
Figure 6 illustrates that an object has been 
detected and confidence levels assigned for 
occupied cells as follows: 
C(3,4) = 0.2, C(3,5) = 0.5, C(3,6) = 0.3,
C(4,5) = 0.8, C(3,7) = 0.3, C(4,6) = 0.8,
C(5,5) = 0.7.
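
The grid map with per-cell confidence levels may be sketched as below. The specification states only that confidence rises with repeated detections and ranges from 0.0 to 1.0; the fixed increment used here is an assumption, as are the class and method names.

```python
from collections import defaultdict


class ConfidenceMap:
    """Two dimensional grid of confidence levels C(u, v) in the range 0.0 to 1.0."""

    def __init__(self, increment=0.25):
        self.increment = increment        # assumed fixed gain per detection
        self.cells = defaultdict(float)   # unobserved cells default to 0.0

    def observe(self, u, v):
        """Record one detection at cell (u, v) and return its new confidence."""
        self.cells[(u, v)] = min(1.0, self.cells[(u, v)] + self.increment)
        return self.cells[(u, v)]


grid = ConfidenceMap()
grid.observe(3, 4)                 # seen once
for _ in range(5):
    grid.observe(3, 5)             # seen in five successive images
print(grid.cells[(3, 4)], grid.cells[(3, 5)])
```

A fuller version would presumably also lower the confidence of cells in which an expected object is no longer detected, and would shift the cells as the robot moves.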

Another geometric representation may be
derived by considering contiguous detections
as a single object, and defining the position and
radius of an enclosing circle as object parameters
for purposes of navigation. The circle in
Fig. 6 illustrates this representation, the circle
having parameters defined as a Center(3,5) and
a Radius equal to two grid cells.
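
An enclosing circle can be derived from a group of occupied cells in several ways. The sketch below uses the centroid of the cells as the center and the distance to the farthest cell as the radius; this choice is an assumption and is not claimed to reproduce the exact Center(3,5) and Radius of two cells quoted for Fig. 6.

```python
import math


def enclosing_circle(cells):
    """Centroid-centered circle that contains every (u, v) cell center."""
    center_u = sum(u for u, _ in cells) / len(cells)
    center_v = sum(v for _, v in cells) / len(cells)
    radius = max(math.hypot(u - center_u, v - center_v) for u, v in cells)
    return (center_u, center_v), radius


# Occupied cells listed for Fig. 6.
occupied_cells = [(3, 4), (3, 5), (3, 6), (4, 5), (3, 7), (4, 6), (5, 5)]
center, radius = enclosing_circle(occupied_cells)
print(center, radius)
```

Rounding this center to the nearest cell and the radius up to a whole number of cells gives a conservative circle suitable for obstacle avoidance.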

The optimum choice of coordinates for
representing the map depends in part on the
manner in which the map is to be used. Initial
inference of object position from structured
light vision in step (c) above yields polar
coordinates. Other sensors, such as sonar, also
yield polar coordinates, R and Theta. It may be
advantageous to combine such multi-sensory
data in the same polar coordinate representation
to generate confidence levels, prior to
converting to x-y coordinates. Cartesian (x, y)
coordinates are computationally advantageous
for representing motion of the robot, which can
be computed by vector addition without altering
the x-y relations between the objects in the
map.
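
The conversion from polar to Cartesian coordinates, and the handling of robot translation by vector arithmetic, may be sketched as follows. The angle convention (Theta measured from the forward y axis, positive to the right) and the function names are assumptions; only translation is shown, not rotation of the robot.

```python
import math


def polar_to_xy(r, theta_deg):
    """Convert range R and bearing Theta (degrees) to robot-centered (x, y)."""
    theta = math.radians(theta_deg)
    return (r * math.sin(theta), r * math.cos(theta))


def shift_for_robot_motion(points, dx, dy):
    """Re-express map points after the robot translates by (dx, dy)."""
    return [(x - dx, y - dy) for x, y in points]


# Two hypothetical detections: straight ahead at 2.0, and 30 degrees right at 4.0.
objects = [polar_to_xy(2.0, 0.0), polar_to_xy(4.0, 30.0)]
moved = shift_for_robot_motion(objects, dx=0.0, dy=0.5)
print(objects, moved)
```

Because the same vector is subtracted from every point, the distances between objects in the map are unchanged, which is the computational advantage noted in the paragraph.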

Whatever coordinate system is chosen for
the map, two dimensions of position are
derivable for objects using structured light.
There is also a third dimension, elevation, which
is available implicitly from the elevation of the
light plane which intersects the object. This may
be useful in discriminating tall objects from
short ones. However, since the physical envelope
of the robot is substantially vertical, an
object at any elevation is normally considered
an obstruction to robot motion. Thus a two-dimensional
map is generally sufficient for
navigation purposes.

Step (d) above involves directing the robot 10
along a path which avoids obstacles or which
corresponds in a prescribed reference frame to
visually measured objects. A variety of well
known path planning techniques can be used.
For example, if there is a prescribed goal path
which is obstructed by an obstacle, one strategy
is to find an alternate path through free space
which is the shortest path between the present
position and a desired goal position.
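
One well known technique of this kind, shown here only as an illustrative sketch, is a breadth-first search over the free cells of the grid map. The specification does not prescribe this particular planner; the 10 x 10 grid size, the four-connected moves, and the start and goal cells are assumptions, while the occupied cells are those listed for Fig. 6.

```python
from collections import deque

def shortest_path(occupied, start, goal, width, height):
    """Breadth-first search over 4-connected free cells.

    Returns the list of cells from start to goal, or None if the
    goal cannot be reached through free space.
    """
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in occupied and nxt not in previous:
                previous[nxt] = cell
                queue.append(nxt)
    return None

occupied = {(3, 4), (3, 5), (3, 6), (4, 5), (3, 7), (4, 6), (5, 5)}
path = shortest_path(occupied, (4, 0), (4, 9), 10, 10)
```

Because breadth-first search expands cells in order of distance from the start, the first path that reaches the goal is a shortest one for unit-cost moves.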

Referring now to Figs. 3a and 3b it can be
seen that obstacle avoidance and/or reference
surface recognition relies on structured light
projection and detection. The reflected structured
light planes are superimposed upon the
horizontal pixel planes of the camera. As an
object approaches the robot, it is first seen at
the top of the field of view (FOV). As it moves
closer to the robot, it moves down in the
camera 12 FOV. Each pixel in the FOV corresponds
to a range (R) and a bearing angle
(Theta) from the robot 10 to the object.
Preferably, each R and Theta are pre-computed
off-line and stored in a read only memory
(ROM) which is permanently installed in the
robot and which is accessed by microcomputer
18D. Alternatively, when the robot 10 is first
energized, a lookup table is compiled by image
processor 18 from equations that determine R
and Theta for each individual camera pixel
relative to the FOV of the environment. During
operation the object detection algorithm searches
the image of the FOV of the camera 12 for
reflections from objects. The R and Theta of any
pixels that are bright enough to exceed a
predetermined threshold value are detected as
locations of objects and stored in a data
structure which defines a map of the robot's
environment, such as that depicted in Fig. 6.
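
The threshold test and table lookup described above may be sketched as follows. The tiny image, the table contents, the threshold of 128 and the name `detect_objects` are hypothetical values chosen for illustration; an actual table would hold one entry per camera pixel.

```python
def detect_objects(image, lookup, threshold):
    """Return the (R, Theta) entry of every pixel brighter than threshold.

    image  -- list of rows of pixel intensities
    lookup -- same shape as image, each entry an (R, Theta) pair
    """
    hits = []
    for row, line in enumerate(image):
        for col, value in enumerate(line):
            if value > threshold:
                hits.append(lookup[row][col])
    return hits

# Hypothetical 2 x 3 image with one bright reflection in the lower row.
image = [[0, 0, 0],
         [0, 200, 0]]
lookup = [[(3.0, -10.0), (3.0, 0.0), (3.0, 10.0)],
          [(1.0, -10.0), (1.0, 0.0), (1.0, 10.0)]]
hits = detect_objects(image, lookup, threshold=128)
```

No trigonometry is evaluated during operation in this sketch; each detection costs one comparison and one table access.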

In a presently preferred embodiment of the 
invention the image is comprised of 512 x 480 
pixels, resulting in approximately 250,000 total 
pixels. By example, it may require the micro- 
computer 18D approximately 33 microseconds 
to read the value of one pixel from the video 
memory 18A. To read all of the pixels would 
require in excess of eight seconds. Thus, in
order to provide for operation in a real-time
manner, not all pixels within each image are
searched; instead only every nth pixel is examined
for a value exceeding a threshold which may
indicate a possible object. As was previously
stated a coarse resolution search is preferably 
performed in the upper region of the FOV, 
wherein objects appear while they are still far 
off, and a finer resolution search is performed in 
the lower region of the FOV where objects are 
seen nearer to the robot 10. 
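
The timing argument and the two-resolution search can be illustrated with the short sketch below. The image size and per-pixel read time are those quoted above; the step sizes of 16 and 4 pixels and the split at the middle row are assumptions chosen only to make the example concrete.

```python
WIDTH, HEIGHT = 512, 480
READ_TIME_S = 33e-6          # per-pixel read time quoted in the text

total_pixels = WIDTH * HEIGHT
full_scan_s = total_pixels * READ_TIME_S     # time to read every pixel

def sample_positions(coarse_step=16, fine_step=4, split_row=HEIGHT // 2):
    """Yield (row, col) positions: coarse sampling above split_row
    (distant objects), finer sampling below it (near objects)."""
    row = 0
    while row < HEIGHT:
        step = coarse_step if row < split_row else fine_step
        for col in range(0, WIDTH, step):
            yield (row, col)
        row += step

samples = list(sample_positions())
sparse_scan_s = len(samples) * READ_TIME_S   # time for the sparse search
print(total_pixels, round(full_scan_s, 2), len(samples))
```

With these assumed step sizes the sparse search reads only a few percent of the pixels, which is what brings the scan time down from seconds to a fraction of a second.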

A cylindrical coordinate system can be 
employed for plotting the position of objects 
with respect to the robot as illustrated in 
Fig. 3b, the origin of the coordinate system 
being the center of the robot. R is the distance 
or range from the center of the robot to the 
object and Theta is the angle to the object. A 
Theta of zero degrees corresponds to an axis 
which is normal to the front of the robot which 
passes through the center. An intermediate X, Y
Cartesian coordinate system is used in the
calculations for obtaining R and Theta. The
origin of this intermediate coordinate system is
a point on the floor directly under the camera
sensor; Y is a vertical axis and X is a horizontal
axis which extends straight out in front of the
robot.

The first step in the analysis is to determine the X,
Y coordinates of the points where the centers
of the light planes and multiple horizontal pixel
planes intersect in the x-y plane. These points can be
seen in Fig. 3a as dots such as A and B along
the lines labelled 14a and 16a which represent
the upper and lower light planes, respectively.

These intersection points can be found by
determining the equations for both lines of
interest and solving them simultaneously. The
basic equation for a line is y = m*x + b where
* denotes multiplication.

It is known that the height from the floor of
the lower projector 16 is Hl. The height of the
upper projector 14 is Hu. The camera height is
Hc and the slopes for the individual horizontal
pixel planes are denoted by CAMS.
The equation for the lower projector 16 is:

y = Hl. (1)

The equation for the camera 12 is:

y = CAMS*x + Hc. (2)

Solving these equations simultaneously yields:

x = (-Hl + Hc)/(-CAMS), and (3)
y = CAMS*x + Hc. (4)

The above equations are solved for each
value of CAMS, that is, the slope of each
individual horizontal pixel plane of the image.
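
As a worked instance of the simultaneous solution, the sketch below intersects one pixel plane with the horizontal lower light plane. The heights and slope used are hypothetical round numbers, and the function name is introduced here for illustration.

```python
def intersect_lower_plane(cams, hc, hl):
    """Intersect the pixel ray y = cams*x + hc with the plane y = hl."""
    x = (-hl + hc) / (-cams)     # equation (3)
    y = cams * x + hc            # the camera line, equation (2), evaluated at x
    return x, y

# Hypothetical values: camera 1.0 above the floor, lower projector at 0.5,
# pixel plane descending with slope -0.5.
x, y = intersect_lower_plane(cams=-0.5, hc=1.0, hl=0.5)
print(x, y)
```

The returned y equals the projector height, as it must for a point lying in the horizontal light plane. A pixel plane with zero or positive slope never descends to the lower plane and has no intersection in front of the camera.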

Initially the slope of the center-line of the
camera 12 (pixel 239.5) may be first determined,
then the slopes of pixels 0 to 239, and 240 to
479 are found.

The slope of the center-line of the camera 12 is
slope = -Hc/CAMD, (5)
where CAMD is a distance along the x-axis to a
point where the center of the camera images
the floor 26. The angle PHI of each pixel ray is
PHI = atan(-Hc/CAMD) +/- atan(i/240 * 3.3/8.0), (6)
where i varies from 1 to 240 and is the number
of pixels from the center of the image. The term
3.3 is one half the sensor height in millimeters
and the term 8.0 is the focal length of the
camera lens in millimeters. Of course these
terms, in addition to the number of the
horizontal and vertical pixels, are specific to a
particular camera and may have different values
if another camera is employed.
The slope for each pixel plane is given by
CAMS = tan(PHI). (7)
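
The computation of CAMS for every pixel row can be sketched as follows. Two points are assumptions rather than statements of the specification: that row 0 is the top of the image, so that the plus sign applies to rows above the center and the minus sign to rows below it, and the camera dimensions passed in the example call.

```python
import math

ROWS = 480                 # vertical pixel count quoted in the text
HALF_SENSOR_MM = 3.3       # one half the sensor height
FOCAL_MM = 8.0             # focal length of the camera lens

def pixel_plane_slopes(hc, camd):
    """Return CAMS for image rows 0 (top) through ROWS - 1 (bottom)."""
    center = math.atan(-hc / camd)       # angle of the center-line slope, eq. (5)
    half = ROWS // 2
    slopes = []
    for row in range(ROWS):
        if row < half:                   # assumed: '+' applies above the center
            i = half - row               # i runs 240 ... 1
            phi = center + math.atan(i / 240 * HALF_SENSOR_MM / FOCAL_MM)
        else:                            # assumed: '-' applies below the center
            i = row - half + 1           # i runs 1 ... 240
            phi = center - math.atan(i / 240 * HALF_SENSOR_MM / FOCAL_MM)
        slopes.append(math.tan(phi))     # eq. (7)
    return slopes

# Hypothetical geometry in meters: camera 0.6 high, center-line meets floor at 1.5.
slopes = pixel_plane_slopes(hc=0.6, camd=1.5)
```

With this convention the slopes decrease steadily from the top row to the bottom row, which matches the observation that nearer objects appear lower in the FOV.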

Once the x, y coordinate of the intersection
point is known, the hypotenuse (h) from each x,
y point to the camera sensor is found using the
Pythagorean Theorem, where
h = sqrt(x**2 + (Hc-y)**2), (8)
where ** denotes exponentiation.

The distance h from the camera 12 to the
obstacle and also the distance x along the floor
from the robot to the obstacle are now known for
an object directly in front of the robot 10 at a
Theta equal to zero degrees.

Next the intersection points of the camera 12
and the structured beams are found for objects
that are other than directly in front of the robot
10. The lateral distance from a centerline, where
the pixel and light plane intersect, to points
disposed to the left and the right is denoted by
nx(i), where i is the number of pixels offset from
the center. The slope to each intersection, as
seen in the top view in Figure 3b, is given by:
slope = ((i/256) * 4.4/8.0), (9)
where i is the number of pixels from the center
line (up to 256), 4.4 is one half of the horizontal
sensor dimension in millimeters, and 8.0 is the
focal length. As before, the constant 4.4 is
camera specific.

The slope to each pixel can also be rep- 
resented as nx(i)/h, therefore: 
nx(i) = ((i/256) * 4.4/8.0) * h. (10) 

R and Theta for each pixel in the FOV can
thereafter be determined in accordance with
the equations:

Theta = atan(nx(i) / (x + offset)), and (11)
R = nx(i) / sin(Theta), (12)
where offset is the distance along the x-axis
from the camera image sensor plane to the
center of the robot. As was previously stated, R
and Theta for each pixel may be computed and
stored in a lookup table prior to operation.
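
The complete table construction for the lower light plane may be sketched as below, following equations (3) and (6) through (12). Several details are assumptions made for illustration: the signed pixel offsets (positive above the center and to the right), the dimensions in the example call, and the use of `math.hypot` for R. The latter equals nx(i)/sin(Theta) whenever Theta is nonzero and remains defined at Theta equal to zero, where equation (12) is indeterminate.

```python
import math

ROWS, COLS = 480, 512
FOCAL_MM = 8.0
HALF_HEIGHT_MM, HALF_WIDTH_MM = 3.3, 4.4

def build_lookup(hc, hl, camd, offset):
    """Return a ROWS x COLS table of (R, Theta in degrees), or None for
    pixel rows whose plane never meets the lower light plane."""
    center = math.atan(-hc / camd)
    table = [[None] * COLS for _ in range(ROWS)]
    for row in range(ROWS):
        # Signed vertical offset: 240..1 above center, -1..-240 below.
        i_v = 240 - row if row < 240 else -(row - 239)
        phi = center + math.atan(i_v / 240 * HALF_HEIGHT_MM / FOCAL_MM)  # eq. (6)
        cams = math.tan(phi)                                             # eq. (7)
        if cams >= 0:
            continue          # plane does not descend to the light plane
        x = (-hl + hc) / (-cams)                                         # eq. (3)
        y = cams * x + hc                                                # camera line
        h = math.sqrt(x ** 2 + (hc - y) ** 2)                            # eq. (8)
        for col in range(COLS):
            # Signed horizontal offset: -256..-1 left, 1..256 right.
            i_h = col - 255 if col >= 256 else col - 256
            nx = (i_h / 256) * HALF_WIDTH_MM / FOCAL_MM * h              # eq. (10)
            theta = math.atan(nx / (x + offset))                         # eq. (11)
            r = math.hypot(nx, x + offset)    # same value as eq. (12), see lead-in
            table[row][col] = (r, math.degrees(theta))
    return table

# Hypothetical geometry in meters.
table = build_lookup(hc=0.6, hl=0.1, camd=1.5, offset=0.2)
```

A second table for the upper light plane would be built the same way with the equation of that plane substituted for equation (1).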

Referring now to Figs. 4 and 5 there is shown
another embodiment of a robot having a
structured light visual navigation and obstacle
avoidance system in accordance with the
invention. Robot 30 has a plurality of structured
light projectors including an upper projector 32,
a lower projector 34, and a camera 36 which is
disposed between the upper and lower projectors.
Robot 30 further comprises a pair of
structured light projectors 38 and 40 which are
disposed on opposite sides of the camera 36 and
in an elevated position therefrom. Projectors 38
and 40 provide a planar beam pattern which is
projected orthogonally to the horizontally projected
beam from projector 34. The planar
beam pattern from upper projector 32 is
projected obliquely downwards such that it
intersects the floor 42 at a position in front of
the robot 30. Other internal components of the
robot 30 are as shown in Fig. 1. That is, the
robot 30 comprises an image processor, a
navigation control processor and a drive and
steering controller. Drive and steering wheels
44 are provided for moving the robot over the
floor 42.

The structured light planes 38a and 40a
shown in Figures 4 and 5 are projected forward
to intersect any objects in the two vertical
planes bounding the robot's forward path
through the environment. As seen in the
illustrative field of view of Figure 7, the vertical
lines 38b and 40b indicate the loci of successive
intersections of light planes with vertical objects
at successive ranges, as seen from the
camera. Thus range, bearing and elevation can
be measured from pixel position using algorithms
exactly analogous to those discussed
previously with regard to horizontal planes of
structured light, as is immediately obvious to
those versed in the art.

It can be seen in Fig. 8 that the camera view
of the oblique structured light plane 32a of
Figs. 4 and 5 is reflected from the floor
substantially uniformly (32b) and horizontally
when there is no obstruction or other feature
closely adjacent to the floor. The image stripe
remains at a fixed position on the screen
regardless of robot motion so long as the floor
42 is flat. This uniformity is broken by a
depression, such as a hole within the floor, or
by an obstacle closely adjacent to the floor.

The depression generates an image with a
break in the stripe 32b having a bright portion
32c disposed below the break. An obstacle
lying on the floor yields a break in the stripe
having a bright portion 32d disposed above the
break. Clearly, the magnitude of displacement
of the bright portions 32c and 32d below and
above the stripe 32b is a measure of range and
elevation, and the position of the break is a
measure of bearing, using algorithms exactly
analogous to those discussed previously with
regard to horizontal planes of structured light,
as is also immediately obvious to those versed
in the art.
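
The interpretation of breaks in the floor stripe may be sketched as a per-column classification. The sketch assumes that image row numbers increase downward, that the row of the brightest pixel in each column has already been found, and that a tolerance of two rows separates noise from a genuine displacement; none of these details is given in the specification.

```python
def classify_columns(stripe_rows, expected_row, tolerance=2):
    """Label each image column by where its bright pixel lies relative
    to the undisturbed floor stripe (rows increase downward)."""
    labels = []
    for row in stripe_rows:
        if row > expected_row + tolerance:
            labels.append("depression")   # bright portion below the stripe
        elif row < expected_row - tolerance:
            labels.append("obstacle")     # bright portion above the stripe
        else:
            labels.append("floor")
    return labels

# Hypothetical stripe expected at row 300; columns 2 and 3 are displaced.
labels = classify_columns([300, 301, 320, 280], expected_row=300)
print(labels)
```

The column index of each non-floor label corresponds to bearing, and the size of the displacement would be converted to range and elevation through a lookup table for the oblique plane.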

When multiple planes of structured light are 
used, as illustrated in Figures 4 and 5, their 
timing should be desynchronized so that there 
is no ambiguity in interpretation of which beam 
is to be associated with any particular pixel 
location. Furthermore, a separate lookup table 
may be associated with each structured light 
source. These lookup tables such as 18F and 
18G, are most conveniently stored in prepro- 
grammed ROM's (read-only-memories). 
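
The pairing of desynchronized projectors with their own lookup tables can be sketched as follows. The alternating firing schedule, the projector labels and the table entries are hypothetical; the sketch only illustrates that the same pixel location maps to different range values depending on which beam was active for the frame.

```python
from itertools import cycle

# Hypothetical per-projector tables: pixel (row, col) -> (R, Theta).
LOOKUP = {
    "lower": {(400, 256): (1.2, 0.0)},
    "upper": {(400, 256): (0.8, 0.0)},
}

def interpret(frames, schedule=("lower", "upper")):
    """Pair each frame with the single projector fired for it, then
    resolve its bright pixels with that projector's table."""
    results = []
    for source, bright_pixels in zip(cycle(schedule), frames):
        for pixel in bright_pixels:
            results.append((source, LOOKUP[source][pixel]))
    return results

# Two successive frames, each showing a reflection at the same pixel.
frames = [[(400, 256)], [(400, 256)]]
out = interpret(frames)
print(out)
```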

For the embodiment of Figs. 4 and 5 the
determination of R and Theta as a function of
pixel position is accomplished in a manner
substantially identical to that disclosed above in
reference to the robot of Fig. 1, it being realized
that suitable adjustments are made for the
height and position of the camera having a
horizontal, forward looking FOV and for the
slope and relative position of the upper beam
projector 32.

It should be realized that the invention may be 
practiced with a variety of types of planar light 
projectors and with a variety of types of image 
sensors or cameras other than those disclosed 
above. For example, the invention may be 
practiced with a structured beam of visible light 
which is received by a vidicon camera. Further- 
more, the exact nature of the image processing 
algorithm may be modified while still achieving a 
substantially identical result. Thus, it should be 
further realized that those having ordinary skill 
in the art may derive a number of modifications 
to the embodiments of the invention disclosed 
above. The invention is therefore not to be 
construed to be limited only to these disclosed 
embodiments but it is instead intended to be 
limited only as defined by the breadth and 
scope of the appended claims. 



Claims 

1. Object detection apparatus carried by a
vehicle which moves over a surface, compris- 
ing: 

means for transmitting at least one structured, 
substantially planar radiation beam in a direc- 
tion of travel of the vehicle; 
means for detecting the planar radiation beam 
which is reflected from at least one surface 
disposed within a region through which the 
vehicle is to travel; and

means for associating the detected radiation 
beam with at least a range and a bearing, 
relative to the vehicle, of the surface reflecting 
the planar radiation beam. 

2. Object detection apparatus as defined in 
claim 1 and further comprising means, coupled 
to an output of said associating means, for 
determining vehicle navigation data as a function
of the range and bearing to the surface.

3. Object detection apparatus as defined in
claim 1 wherein said means for transmitting 
comprises a pulsed source of radiation. 

4. Object detection apparatus as defined in 
claim 3 wherein said source of radiation has 
output wavelengths within a range of wave- 
length of approximately 700 nm to approxi- 
mately 1000 nm. 

5. Object detection apparatus as defined in 
claim 1 wherein said means for detecting 
comprises means for generating a two-dimen- 
sional field of view comprising a plurality of 
pixels, each of said pixels having an associated 
value which is a function of the amount of the 
reflected structured radiation within an associated
portion of the field of view.

6. Object detection apparatus as defined in 
claim 4 wherein said means for detecting 
comprises a CCD imaging device which gener- 
ates a two-dimensional field of view comprising 
a plurality of pixels. 

7. Object detection apparatus as defined in 
claim 5 wherein said means for associating 
comprises means for generating a range and a 
bearing for each pixel within the field of view. 

8. Object detection apparatus as defined in 
claim 7 wherein said associating means further 
comprises means for storing data expressive of 
the range and bearing for each pixel within the 
field of view. 

9. A method of generating navigation related 
information for a mobile robot which moves 
upon a supporting surface, comprising the 
steps of: 

projecting at least one structured, substantially 
planar radiation beam into an environment in 
front of and including a desired path of the 
robot, the radiation beam forming a stripe-like 
pattern on a surface disposed within the 
environment; 

generating a two-dimensional image of the
environment, the image including at least the
stripe-like pattern which reflects from a surface,
if any, within the environment;
locating the image of a stripe-like pattern in the
image;

inferring from the position within the image of 
the stripe-like pattern a range and a bearing of 
the surface relative to the robot; 
generating a geometric map representation of 
the inferred surface position; and 
processing the map information to generate
robot motion control signals which result in
avoidance of obstacles or navigation to reference
landmarks.

10. A method as defined in claim 9 wherein the 
step of generating generates an image com- 
prised of a two-dimensional array of pixels, 
each of the pixels having an associated value 
which is a function of the amount of the 
reflected radiation within an associated portion 
of the image. 

11. A method as defined in claim 10 wherein
the step of locating includes a step of determining
the location within the array of pixels of one
or more pixels which have a value equal to or
greater than a predetermined threshold value.

12. A method as defined in claim 11 wherein
the step of inferring includes a step of accessing
an entry within a data structure having a
plurality of range and bearing entries, each of
the entries corresponding to one of the pixels,
the location of the entry being accessed being a
function of the determined pixel location.

13. A method as defined in claim 11 wherein
the step of inferring includes a step of converting
a value of each of the pixels to a binary
value.

14. A method as defined in claim 11 wherein
the step of locating is accomplished by first
searching the image with a resolution of n pixels
to first locate a pixel or pixels having a value
equal to or greater than the predetermined
threshold value and is further accomplished by
searching the image with a resolution of m
pixels after the pixel or pixels are located, and
wherein n is greater than m.

15. A method as defined in claim 9 wherein the
step of generating a geometric map includes
the steps of:

representing the supporting surface as a two-dimensional
grid comprised of grid cells; and
designating those grid cells which correspond
to a range and a bearing, relative to the robot, of
the stripe-like pattern reflecting surface.

16. A method as defined in claim 15 and further
comprising the step of:

assigning a confidence factor to designated
grid cells based upon a persistence of inferred
ranges and bearings which correspond to the
designated cells.

17. A method as defined in claim 16 and further
comprising a step of:

determining a radius which encloses one or more
designated grid cells.

18. A vision system coupled to a mobile robot,
comprising:

means for transmitting at least one structured,
substantially planar radiation beam in a direction
of travel of the robot;
means for detecting the planar radiation beam
which is reflected from at least one surface
disposed within a region through which the
robot is to travel;
means for associating the detected radiation
beam with a range and a bearing, relative to the
robot, of the surface reflecting the planar
radiation beam; and
means, coupled to an output of said associating
means, for determining robot navigation data as
a function of the range and bearing to the
surface.

19. A vision system as defined in claim 18
wherein said means for transmitting comprises
a flashlamp.

20. A vision system as defined in claim 18 
wherein said means for transmitting comprises 
one or more light emitting diodes. 

21. A vision system as defined in claim 18
wherein said means for transmitting comprises
one or more incandescent lamps.

22. A vision system as defined in claim 18 
wherein said means for transmitting comprises 
one or more laser means. 

23. A vision system as defined in claim 18
wherein the beam is transmitted substantially
parallel to a substantially planar surface over
which the robot moves.

24. A vision system as defined in claim 18
wherein the beam is transmitted obliquely
downward relative to a substantially planar
surface over which the robot moves.

25. A vision system as defined in claim 18
wherein the beam is transmitted substantially
perpendicularly relative to a substantially planar
surface over which the robot moves.

26. A vision system as defined in claim 18
wherein said means for detecting comprises a
CCD imaging means.

27. A vision system as defined in claim 18
wherein said means for detecting comprises a
vidicon imaging means.

28. A vision system as defined in claim 19
wherein said flashlamp comprises a flashlamp
having an elongated shape and wherein said
means for transmitting further comprises in
combination:

a substantially semicircular cylindrical reflector
means disposed substantially parallel to said
flashlamp;
first cylindrical lens means having a focal length
and being disposed substantially parallel to said
flashlamp for focussing radiation emanating
from said flashlamp and also from said reflector
means, and
second cylindrical means having a focal length
and being disposed substantially parallel to said
first cylindrical lens means at a distance which
is greater than the focal length of said first
cylindrical lens means, said second cylindrical
lens means collimating the radiation from said
first cylindrical lens means for generating said
substantially planar radiation beam.

29. A vision system as defined in claim 28
wherein said flashlamp comprises a xenon
flashlamp.

30. A vision system as defined in claim 28
wherein said first and said second cylindrical
lens means each have a focal length of
approximately 0.5 inches.

31. A vision system as defined in claim 28
wherein each of said cylindrical lens means
comprises a Fresnel lens.

32. A vision system as defined in claim 28
wherein said cylindrical reflector means has a
focal length and wherein said flashlamp is
disposed at approximately the focal length of
said cylindrical reflector means.

33. A vision system as defined in claim 28 and
further comprising an optical filter means
disposed to receive said collimated radiation,
said optical filter means passing radiation
therethrough within a range of wavelengths of
approximately 700 nm to approximately 1000
nm.



34. A vision system as defined in claim 19 
wherein said flashlamp comprises a flashlamp 
having an elongated shape and wherein said 
means for transmitting further comprises in 
combination: 

a substantially semicircular cylindrical reflector 
means disposed substantially parallel to said 
flashlamp; and 

at least one slit-like aperture disposed substan- 
tially parallel to said flashlamp for forming 
radiation emanating from said flashlamp and 
said reflector means into a substantially planar 
beam of radiation.